Chunking #187
|
Oh, you're ahead of me #188 |
|
An anomaly for sure... |
Yes, I'm not sure about the parallelization, nor about using skip. I think some combination of the two would be good. I forgot to use @value, for instance. |
|
Well, I was trying to find the simplest way to avoid making copies and also support multithreading. |
I just think you then have the extra loop, whereas mine still does just one loop. But I'm not sure if it matters much. |
|
well I am hoping that the multi threaded bit means that when you have large number of channels being indexed |
|
@tynanford do you want to share your opinion? |
|
Tagging @conorschofield as well, since I don't know enough Java to have an opinion on skip or how to structure the loops. We also do the same as ESS and repopulate the entire CF instance every so often. All the CF data is stored in IOCs or in a MATLAB script which adds MML metadata to CF, so parallelization sounds good to me. |
|
I'm wondering if some of the tests are breaking because there is no longer a consistent ordering of the returned channels. |
I based it on the default Elastic window size. 10 000 seems to be fine most of the time anyway. We hit the limit with a 170 000 IOC; I calculated that the problem starts at around 110 000, so I think 10 000 is a good default. |
|
|
I don't think that is the case with the manual IT tests. Maybe we can have one preference for both/all the chunking operations. |
Rename to repository.chunk.size
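To illustrate the single shared preference suggested above, here is a minimal sketch of reading `repository.chunk.size` with a fallback of 10 000. It uses plain `java.util.Properties` so that it is self-contained; the class name `ChunkSizePreference` and the method `chunkSize` are hypothetical and not taken from the project, which may well wire this value differently (for example through an injected property).

```java
import java.util.Properties;

public class ChunkSizePreference {

    // Default of 10 000, as proposed in the discussion above.
    static final int DEFAULT_CHUNK_SIZE = 10_000;

    // Hypothetical helper: returns the configured chunk size,
    // or the default when the property is missing or blank.
    public static int chunkSize(Properties props) {
        String raw = props.getProperty("repository.chunk.size");
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_CHUNK_SIZE;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException("repository.chunk.size must be positive");
        }
        return value;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(chunkSize(props)); // property not set, falls back to the default
        props.setProperty("repository.chunk.size", "5000");
        System.out.println(chunkSize(props)); // explicit value is used
    }
}
```

With one property, every chunked operation (index, update, and so on) would read the same value instead of each defining its own limit.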
|
|




Break very large index and update requests into smaller chunks.
Addresses issue #186.
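As a rough sketch of the idea, not the code in this PR, a large list can be split into chunks without copying by using `List.subList`, which returns views backed by the original list. The class `ChunkingSketch`, the method `chunk`, and the print statement standing in for a bulk index request are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkingSketch {

    // Splits items into consecutive chunks of at most chunkSize elements.
    // Each chunk is a subList view of the original list, so no elements are copied.
    public static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        int start = 0;
        while (start < items.size()) {
            // Computing the end this way avoids overflow for very large chunk sizes.
            int end = start + Math.min(chunkSize, items.size() - start);
            chunks.add(items.subList(start, end));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> channels = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            channels.add("channel-" + i);
        }
        // Hypothetical stand-in for sending one bulk request per chunk.
        // With a parallel stream the chunks are processed in no guaranteed order.
        chunk(channels, 10).parallelStream()
                .forEach(c -> System.out.println("indexing chunk of " + c.size() + " channels"));
    }
}
```

Because the chunks are views, the source list must not be structurally modified while they are in use. Processing chunks in parallel also means results may complete in a different order from the input, which is worth keeping in mind for tests that assume a consistent ordering.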